EXPECTATION, CONDITIONAL EXPECTATION AND 
MARTINGALES IN LOCAL FIELDS 

STEVEN N. EVANS AND TYE LIDMAN 

Abstract. We investigate a possible definition of expectation and
conditional expectation for random variables with values in a local
field such as the p-adic numbers. We define the expectation by
analogy with the observation that for real-valued random variables
in $L^2$ the expected value is the orthogonal projection onto the
constants. Previous work has shown that the local field version of
$L^\infty$ is the appropriate counterpart of $L^2$, and so the expected
value of a local field-valued random variable is defined to be its
"projection" in $L^\infty$ onto the constants. Unlike the real case, the
resulting projection is not typically a single constant, but rather a
ball in the metric on the local field. However, many properties of this
expectation operation and the corresponding conditional expectation
mirror those familiar from the real-valued case; for example,
conditional expectation is, in a suitable sense, a contraction on
$L^\infty$ and the tower property holds. We also define the corresponding
notion of martingale, show that several standard examples of martingales
(for example, sums or products of suitable independent random
variables or "harmonic" functions composed with Markov chains)
have local field analogues, and obtain versions of the optional
sampling and martingale convergence theorems.



1. Introduction 

Expectation and conditional expectation of real-valued random
variables (or, more generally, Banach space-valued random variables) and
the corresponding notion of martingale are fundamental objects of
probability theory. In this paper we investigate whether there are
analogous notions for random variables with values in a local field
(that is, a locally compact, non-discrete, totally disconnected,
topological field), a setting that shares the linear structure which
underpins many of the properties of the classical entities.

The best known example of a local field is the field of p-adic numbers
for some positive prime p. This field is defined as follows. We can write
any non-zero rational number $r \in \mathbb{Q} \setminus \{0\}$ uniquely as
$r = p^s (a/b)$, with a, b, and s integers, where a and b are not divisible
by p. Set $|r| = p^{-s}$. If we set $|0| = 0$, then the map $|\cdot|$ has the
properties:

\[
\begin{aligned}
|x| = 0 &\Leftrightarrow x = 0, \\
|xy| &= |x|\,|y|, \\
|x + y| &\le |x| \vee |y|.
\end{aligned}
\tag{1}
\]



The map $(x, y) \mapsto |x - y|$ defines a metric on $\mathbb{Q}$ and we
denote the completion of $\mathbb{Q}$ in this metric by $\mathbb{Q}_p$. The
field operations on $\mathbb{Q}$ extend continuously to make $\mathbb{Q}_p$ a
topological field called the p-adic numbers. The map $|\cdot|$ also extends
continuously and the extension has properties (1).

The closed unit ball around 0, $\mathbb{Z}_p = \{x \in \mathbb{Q}_p : |x| \le 1\}$,
is the closure in $\mathbb{Q}_p$ of the integers $\mathbb{Z}$, and is thus a
ring (this is also apparent from (1)), called the p-adic integers. As
$\mathbb{Z}_p = \{x \in \mathbb{Q}_p : |x| < p\}$, the set $\mathbb{Z}_p$ is
also open. Any other ball around 0 is of the form
$\{x \in \mathbb{Q}_p : |x| \le p^{-k}\} = p^k \mathbb{Z}_p$ for some integer k.
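To make the definition of $|\cdot|$ on $\mathbb{Q}$ concrete, here is a minimal Python sketch that computes the p-adic absolute value of a rational number exactly and spot-checks the last two properties in (1) on a small grid of rationals. The names `padic_abs` and `check_properties` and the sample values are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction
from itertools import product


def padic_abs(x, p):
    """Return |x|_p = p**(-s) for a rational x = p**s * (a/b), with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, s = x.numerator, x.denominator, 0
    while num % p == 0:   # factors of p in the numerator raise s
        num //= p
        s += 1
    while den % p == 0:   # factors of p in the denominator lower s
        den //= p
        s -= 1
    return Fraction(1, p) ** s


def check_properties(values, p):
    """Check multiplicativity and the strong triangle inequality on all pairs."""
    for x, y in product(values, repeat=2):
        assert padic_abs(x * y, p) == padic_abs(x, p) * padic_abs(y, p)
        assert padic_abs(x + y, p) <= max(padic_abs(x, p), padic_abs(y, p))
    return True


# Hypothetical sample of rationals used only for the spot check.
sample = [Fraction(n, d) for n in range(-6, 7) for d in (1, 2, 3, 9)]
print(padic_abs(18, 3), padic_abs(Fraction(5, 27), 3), check_properties(sample, 3))
```

Exact `Fraction` arithmetic is used so that comparisons of absolute values are not affected by floating-point rounding.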

Every local field is either a finite algebraic extension of the p-adic
number field for some prime p or a finite algebraic extension of the
p-series field; that is, the field of formal Laurent series with
coefficients drawn from the finite field with p elements. A locally
compact, non-discrete, topological field that is not totally disconnected
is necessarily either the real or the complex numbers.

From now on, we let $\mathbb{K}$ be a fixed local field. Good general
references for the properties of local fields and analysis on them are
[Sch84, vR78, Tai75]. The following are the properties we need.

There is a real-valued mapping $x \mapsto |x|$ on $\mathbb{K}$ called the
non-archimedean valuation with the properties (1). The third of these
properties is the ultrametric inequality or the strong triangle inequality.
The map $(x, y) \mapsto |x - y|$ on $\mathbb{K} \times \mathbb{K}$ is a metric
on $\mathbb{K}$ which gives the topology of $\mathbb{K}$. A consequence of the
strong triangle inequality is that if $|x| \ne |y|$, then
$|x + y| = |x| \vee |y|$. This latter result implies that for every
"triangle" $\{x, y, z\} \subset \mathbb{K}$ we have that at least two of the
lengths $|x - y|$, $|x - z|$, $|y - z|$ must be equal and is therefore often
called the isosceles triangle property.
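The isosceles triangle property can be checked numerically in $\mathbb{Q}_2$ restricted to rationals. The sketch below is only an illustration under that restriction; the helper names and the sample points are assumptions made for the example. It tests the slightly sharper statement that the two longest sides of every triangle are equal.

```python
from fractions import Fraction
from itertools import combinations


def padic_abs(x, p):
    """Exact p-adic absolute value of a rational x, with |0| = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, s = x.numerator, x.denominator, 0
    while num % p == 0:
        num, s = num // p, s + 1
    while den % p == 0:
        den, s = den // p, s - 1
    return Fraction(1, p) ** s


def side_lengths(x, y, z, p):
    """Sorted p-adic lengths of the three sides of the triangle {x, y, z}."""
    return sorted(padic_abs(a - b, p) for a, b in combinations((x, y, z), 2))


def is_isosceles(x, y, z, p):
    """True when the two longest sides have the same length."""
    lengths = side_lengths(x, y, z, p)
    return lengths[1] == lengths[2]


# Hypothetical sample points; every triple drawn from them is tested.
points = [Fraction(n, d) for n in range(-5, 6) for d in (1, 2, 4)]
all_isosceles = all(is_isosceles(x, y, z, 2) for x, y, z in combinations(points, 3))
print(side_lengths(0, 1, 3, 2), all_isosceles)
```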

The valuation takes the values $\{q^k : k \in \mathbb{Z}\} \cup \{0\}$, where
$q = p^c$ for some prime p and positive integer c (so that for
$\mathbb{K} = \mathbb{Q}_p$ we have c = 1). Write $\mathbb{D}$ for
$\{x \in \mathbb{K} : |x| \le 1\}$ (so that $\mathbb{D} = \mathbb{Z}_p$ when
$\mathbb{K} = \mathbb{Q}_p$). Fix $\rho \in \mathbb{K}$ so that $|\rho| = q^{-1}$.
Then

\[
\rho^k \mathbb{D} = \{x : |x| \le q^{-k}\} = \{x : |x| < q^{-(k-1)}\}
\]

for each $k \in \mathbb{Z}$ (so that for $\mathbb{K} = \mathbb{Q}_p$ we could take
$\rho = p$). The set $\mathbb{D}$ is the unique maximal compact subring of
$\mathbb{K}$ (the ring of integers of $\mathbb{K}$). Every ball in $\mathbb{K}$
is of the form $x + \rho^k \mathbb{D}$ for some $x \in \mathbb{K}$ and
$k \in \mathbb{Z}$. If $B = x + \rho^k \mathbb{D}$ and $C = y + \rho^\ell \mathbb{D}$
are two such balls, then

• $B \cap C = \emptyset$, if $|x - y| > q^{-k} \vee q^{-\ell}$,

• $B \subseteq C$, if $|x - y| \vee q^{-k} \le q^{-\ell}$,

• $C \subseteq B$, if $|x - y| \vee q^{-\ell} \le q^{-k}$.

In particular, if $q^{-k} = q^{-\ell}$, then either $B \cap C = \emptyset$ or
$B = C$, depending on whether $|x - y| > q^{-k} = q^{-\ell}$ or
$|x - y| \le q^{-k} = q^{-\ell}$.
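The trichotomy for balls can be sketched in code for the special case $\mathbb{K} = \mathbb{Q}_p$ with $\rho = p$ and rational centers. The function `ball_relation` and its string return values are hypothetical conveniences for this illustration only.

```python
from fractions import Fraction


def padic_abs(x, p):
    """Exact p-adic absolute value of a rational x, with |0| = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, s = x.numerator, x.denominator, 0
    while num % p == 0:
        num, s = num // p, s + 1
    while den % p == 0:
        den, s = den // p, s - 1
    return Fraction(1, p) ** s


def ball_relation(x, k, y, l, p):
    """Compare B = x + p**k Z_p with C = y + p**l Z_p (radii p**-k and p**-l)."""
    dist = padic_abs(Fraction(x) - Fraction(y), p)
    r_b, r_c = Fraction(1, p) ** k, Fraction(1, p) ** l
    if dist > max(r_b, r_c):
        return "disjoint"
    b_in_c = max(dist, r_b) <= r_c
    c_in_b = max(dist, r_c) <= r_b
    if b_in_c and c_in_b:
        return "equal"
    return "B inside C" if b_in_c else "C inside B"


# Balls in Q_3: two balls are either disjoint or one contains the other.
print(ball_relation(0, 1, 1, 1, 3))  # centers at distance 1, both radii 1/3
print(ball_relation(1, 1, 4, 1, 3))  # centers at distance 1/3, both radii 1/3
print(ball_relation(1, 2, 4, 1, 3))  # radius 1/9 against radius 1/3
```

The three branches mirror the three bullet conditions; no fourth case can occur, because whenever the distance between centers is at most the larger radius the smaller ball lies inside the larger one.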

We have shown in a sequence of papers [Eva89, Eva91, Eva93, Eva95,
Eva01b, Eva01a, Eva02, Eva06] that the natural analogues on $\mathbb{K}$ of
the centered Gaussian measures on $\mathbb{R}$ are the normalized restrictions
of Haar measure on the additive group of $\mathbb{K}$ to the compact balls
$\rho^k \mathbb{D}$ and the point mass at 0. There is a significant literature
on probability on the p-adics and other local fields. The above papers
contain numerous references to this work, much of which concerns Markov
processes taking values in local fields. There are also extensive surveys
of the literature in the books [Khr97, Koc01, KN04].

It is not immediately clear how one should approach defining the
expectation of a local field valued random variable X. Even if X
only takes a finite number of values $\{x_1, x_2, \ldots, x_n\}$, the object
$\sum_k x_k \mathbb{P}\{X = x_k\}$ doesn't make any sense because
$x_k \in \mathbb{K}$ whereas $\mathbb{P}\{X = x_k\} \in \mathbb{R}$. However, it is
an elementary fact that if T is a real-valued random variable with
$\mathbb{E}[T^2] < \infty$, then $c \mapsto \mathbb{E}[(T - c)^2]$ is uniquely
minimized by $c = \mathbb{E}[T]$. Of course, since this observation already
uses the notion of expectation it does not lead to an alternative way of
defining the expected value of a real-valued random variable.
Fortunately, we can do something similar, but non-circular, in the local
field case.

Fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. By a
$\mathbb{K}$-valued random variable, we mean a measurable map from $\Omega$
equipped with $\mathcal{F}$ into $\mathbb{K}$ equipped with its Borel σ-field.
Let $L^\infty$ be the space of $\mathbb{K}$-valued random variables X that
satisfy $\|X\|_\infty := \operatorname{ess\,sup} |X| < \infty$. It is clear that
$L^\infty$ is a vector space over $\mathbb{K}$. If we identify two random
variables as being equal when they are equal almost surely, then

\[
\begin{aligned}
\|X\|_\infty = 0 &\Leftrightarrow X = 0, \\
\|cX\|_\infty &= |c|\,\|X\|_\infty, \quad c \in \mathbb{K}, \\
\|X + Y\|_\infty &\le \|X\|_\infty \vee \|Y\|_\infty.
\end{aligned}
\]

The map $(X, Y) \mapsto \|X - Y\|_\infty$ defines a metric on $L^\infty$ (or, more
correctly, on equivalence classes under the relation of equality almost
everywhere), and $L^\infty$ is complete in this metric. Hence $L^\infty$ is an
instance of a Banach algebra over $\mathbb{K}$.

It is apparent from the papers on analogues of Gaussian measures
cited above that $L^\infty$ is the natural local field counterpart of the real
Hilbert space $L^2$. In particular, there is a natural notion of
orthogonality on $L^\infty$ (albeit one which does not come from an inner
product structure).

Definition 1.1. Given $X \in L^\infty$, set
$\varepsilon(X) = \inf\{\|X - c\|_\infty : c \in \mathbb{K}\}$.
The expectation of the $\mathbb{K}$-valued random variable X is the subset of
$\mathbb{K}$ given by

\[
\mathbb{E}[X] := \{c \in \mathbb{K} : \|X - c\|_\infty = \varepsilon(X)\}
= \{c \in \mathbb{K} : \|X - c\|_\infty \le \varepsilon(X)\}.
\]

We show in Section 2 that $\mathbb{E}[X]$ is non-empty. Note that if
$c' \in \mathbb{E}[X]$ and $c'' \in \mathbb{K}$ is such that
$|c'' - c'| \le \varepsilon(X)$, then, by the strong triangle inequality,
$c'' \in \mathbb{E}[X]$. Thus $\mathbb{E}[X]$ is a (closed) ball in $\mathbb{K}$
(where we take a single point as being a ball).

Observe that we use the same notation for expectation of $\mathbb{K}$-valued
and $\mathbb{R}$-valued random variables. This should cause no confusion: we
either indicate explicitly whether a random variable has values in
$\mathbb{K}$ or $\mathbb{R}$, or this will be clear from context.

The outline of the rest of the paper is the following. We show in
Section 2 that the expected value of a random variable in $L^\infty$ is
non-empty, remark on some of the properties of the expectation operator,
and motivate the definition of conditional expectation by considering
the situation where the conditioning σ-field is finitely generated or,
more generally, has an associated regular conditional probability. The
appropriate definition of the conditional expectation of $X \in L^\infty$
given a sub-σ-field $\mathcal{G} \subseteq \mathcal{F}$ is not, as one might
first imagine, the $L^\infty$ projection of X onto $L^\infty(\mathcal{G})$ (:= the
subspace of $L^\infty$ consisting of $\mathcal{G}$-measurable random variables).
For this reason, we need to do some preparatory work in Sections 3 and 4
before finally presenting the construction of conditional expectation in
Section 5 and describing its elementary properties in Section 6. We
establish an analogue of the "tower property" in Section 7 and obtain, in
Section 8, a counterpart of the fact for classical conditional expectation
that conditioning is a contraction on $L^2$ (both of these results need to
be suitably interpreted due to the conditional expectation being typically
a set of random variables rather than a single one). We introduce the
associated notion of martingale in Section 9 and observe that several of
the classical examples of martingales have local field analogues. We
develop counterparts of the optional sampling theorem and martingale
convergence theorem in Sections 10 and 11, respectively.

Note: We adopt the convention that all equalities and inequalities 
between random variables should be interpreted as holding P-almost 
surely. 

2. Expectation 

Theorem 2.1. The expectation of a random variable $X \in L^\infty$ is
non-empty. It is the smallest closed ball in $\mathbb{K}$ that contains
supp X (the closed support of X).

Proof. By the strong triangle inequality
$\|X - c\|_\infty \le \|X\|_\infty \vee |c|$, and $\|X - c\|_\infty = |c|$ for
$|c| > \|X\|_\infty$. Therefore, the infimum of $c \mapsto \|X - c\|_\infty$ over
all $c \in \mathbb{K}$ is the same as the infimum over
$\{c \in \mathbb{K} : |c| \le \|X\|_\infty\}$ and any point $c \in \mathbb{K}$ at
which the infimum is achieved must necessarily satisfy
$|c| \le \|X\|_\infty$. That is,
$\varepsilon(X) = \inf\{\|X - c\|_\infty : |c| \le \|X\|_\infty\}$ and
$\mathbb{E}[X] = \{c : |c| \le \|X\|_\infty, \ \|X - c\|_\infty = \varepsilon(X)\}$.

Again by the strong triangle inequality, the function
$c \mapsto \|X - c\|_\infty$ is continuous. Consequently, $\mathbb{E}[X]$ is
non-empty as the set of points at which a continuous function on a compact
set attains its infimum.

As we observed in the Introduction, $\mathbb{E}[X]$ is a ball of radius
(= diameter) $\varepsilon(X)$. If $x \in \operatorname{supp} X$ is not in
$\mathbb{E}[X]$ and c is any point in $\mathbb{E}[X]$, then, by the strong
triangle inequality, $|x - c| > \varepsilon(X)$ and
$\|X - c\|_\infty > \varepsilon(X)$, contradicting the definition of
$\mathbb{E}[X]$. Thus $\operatorname{supp} X \subseteq \mathbb{E}[X]$. Hence, if
the smallest ball containing supp X is not $\mathbb{E}[X]$, it must be a ball
contained in $\mathbb{E}[X]$ with diameter $r < \varepsilon(X)$. However, if c
is any point contained in the smaller ball, then $|x - c| \le r$ for all
$x \in \operatorname{supp} X$, contradicting the definition of
$\varepsilon(X)$. □
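For a random variable with finitely many rational values in $\mathbb{Q}_3$, each taken with positive probability, Theorem 2.1 reduces the computation of the expectation to finding the smallest ball containing the values. The following sketch, with hypothetical helper names and a hypothetical support, computes that ball and compares it with a brute-force minimization of $\|X - c\|_\infty$ over a grid of rational candidates c.

```python
from fractions import Fraction


def padic_abs(x, p):
    """Exact p-adic absolute value of a rational x, with |0| = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, s = x.numerator, x.denominator, 0
    while num % p == 0:
        num, s = num // p, s + 1
    while den % p == 0:
        den, s = den // p, s - 1
    return Fraction(1, p) ** s


def expectation_ball(support, p):
    """Smallest closed ball containing a finite support, as (center, radius)."""
    pts = [Fraction(x) for x in support]
    center = pts[0]  # in an ultrametric space every point of a ball is a center
    radius = max(padic_abs(x - center, p) for x in pts)
    return center, radius


def sup_distance(support, c, p):
    """||X - c||_inf when every support point has positive probability."""
    return max(padic_abs(Fraction(x) - c, p) for x in support)


support = [1, 4, 10]  # hypothetical values of X in Q_3
center, eps = expectation_ball(support, 3)

# Brute force over a grid of candidates: no candidate beats eps, and the
# candidates attaining eps are exactly those lying in the ball.
grid = [Fraction(n, d) for n in range(-30, 31) for d in (1, 2, 3)]
best = min(sup_distance(support, c, 3) for c in grid)
minimizers_in_ball = all(
    (sup_distance(support, c, 3) == eps) == (padic_abs(c - center, 3) <= eps)
    for c in grid
)
print(center, eps, best, minimizers_in_ball)
```

Note that the probabilities attached to the values play no role beyond being positive, in line with the description of the expectation as a ball determined by the support.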

Our notion of expectation shares some of the features of both the
mean and the variance of a real-valued variable. Any point in the
ball $\mathbb{E}[X]$ is as good a single summary of the "location" of X as any
other, whereas the diameter of $\mathbb{E}[X]$ (that is, $\varepsilon(X)$) is a
measure of the "spread" of X.

Some properties of $\mathbb{E}[X]$ are immediate. It is easily seen that for
constants $k, b \in \mathbb{K}$, $\mathbb{E}[kX + b] = k\mathbb{E}[X] + b$. We do
not have complete linearity, however, since $\mathbb{E}[X + Y]$ is only a
subset of $\mathbb{E}[X] + \mathbb{E}[Y]$, with equality when X and Y are
independent. This follows from the fact that
$\operatorname{supp}(X + Y) \subseteq \operatorname{supp} X + \operatorname{supp} Y$,
with equality when X and Y are independent. Also, if X and Y are
independent, then $\mathbb{E}[XY] = \mathbb{E}[X]\mathbb{E}[Y]$. These remarks
further support our assertion that $\mathbb{E}[X]$ combines the properties of
the mean and the variance for real-valued random variables.
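These properties can be illustrated on finitely supported rational-valued variables in $\mathbb{Q}_3$. The sketch below assumes all listed values have positive probability and uses the standard fact that the sum of two balls with radii r and r' is the ball around the sum of the centers with radius $r \vee r'$; the helper names and the sample values are hypothetical.

```python
from fractions import Fraction


def padic_abs(x, p):
    """Exact p-adic absolute value of a rational x, with |0| = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, s = x.numerator, x.denominator, 0
    while num % p == 0:
        num, s = num // p, s + 1
    while den % p == 0:
        den, s = den // p, s - 1
    return Fraction(1, p) ** s


def expectation_ball(support, p):
    """Smallest closed ball containing a finite support, as (center, radius)."""
    pts = [Fraction(x) for x in support]
    return pts[0], max(padic_abs(x - pts[0], p) for x in pts)


def same_ball(a, b, p):
    """Two (center, radius) pairs describe the same ball."""
    (c1, r1), (c2, r2) = a, b
    return r1 == r2 and padic_abs(c1 - c2, p) <= r1


p = 3
X = [1, 4]                      # hypothetical values of X
Y = [-x for x in X]             # values of Y
ball_X, ball_Y = expectation_ball(X, p), expectation_ball(Y, p)

# Affine maps: E[kX + b] = k E[X] + b.
k, b = 6, 5
affine_lhs = expectation_ball([k * x + b for x in X], p)
affine_rhs = (k * ball_X[0] + b, padic_abs(k, p) * ball_X[1])

# Sum of the two balls E[X] + E[Y].
sum_of_balls = (ball_X[0] + ball_Y[0], max(ball_X[1], ball_Y[1]))
# Fully dependent case Y = -X: the support of X + Y is the single point 0.
dependent_sum = expectation_ball([x - x for x in X], p)
# Independent case: the support of X + Y is the full sumset.
independent_sum = expectation_ball([x + y for x in X for y in Y], p)

print(same_ball(affine_lhs, affine_rhs, p))
print(dependent_sum, sum_of_balls, same_ball(independent_sum, sum_of_balls, p))
```

In the dependent case the expectation of the sum is the single point 0, a strict subset of the ball of radius 1/3 obtained by adding the two expectations.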

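Define the Hausdorff distance between two subsets A and B of $\mathbb{K}$ to
be

\[
d_H(A, B) := \sup_{a \in A} \inf_{b \in B} |a - b| \ \vee\ \sup_{b \in B} \inf_{a \in A} |b - a|.
\]

We know from Theorem 2.1 that $\mathbb{E}[X]$ and $\mathbb{E}[Y]$ are balls with
diameters $\varepsilon(X)$ and $\varepsilon(Y)$, respectively. We have one of the
alternatives $\mathbb{E}[X] = \mathbb{E}[Y]$,
$\mathbb{E}[X] \subsetneq \mathbb{E}[Y]$,
$\mathbb{E}[Y] \subsetneq \mathbb{E}[X]$, or
$\mathbb{E}[X] \cap \mathbb{E}[Y] = \emptyset$. Suppose that
$\mathbb{E}[X] \subsetneq \mathbb{E}[Y]$, so that
$\operatorname{supp} X \subseteq \mathbb{E}[X]$ and there exists
$y \in \operatorname{supp} Y$ such that y is not in the unique ball of diameter
$q^{-1}\varepsilon(Y)$ containing $\mathbb{E}[X]$. Then, by the strong triangle
inequality, $|x - y| = \varepsilon(Y)$ for all $x \in \operatorname{supp} X$, and
so $d_H(\operatorname{supp} X, \operatorname{supp} Y) \ge \varepsilon(Y) = d_H(\mathbb{E}[X], \mathbb{E}[Y])$
in this case. Similar arguments in the other cases show that

\[
d_H(\mathbb{E}[X], \mathbb{E}[Y]) \le d_H(\operatorname{supp} X, \operatorname{supp} Y) \le \|X - Y\|_\infty.
\]

This is analogous to the continuity of real-valued expectation with
respect to the real $L^p$ norms.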

Rather than develop more properties of expectation, we move on to
the corresponding definition of conditional expectation because, just as
in the real case, expectation is the special case of conditional
expectation that occurs when the conditioning σ-field is the trivial
σ-field $\{\emptyset, \Omega\}$, and so results for expectation are just special
cases of ones for conditional expectation.

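In order to motivate the definition of conditional expectation, first
consider the special case when the conditioning σ-field
$\mathcal{G} \subseteq \mathcal{F}$ is generated by a finite partition
$\{A_1, A_2, \ldots, A_n\}$ of $\Omega$. In line with our definition of
$\mathbb{E}[X]$, a reasonable definition of $\mathbb{E}[X \mid \mathcal{G}]$ would
be the set of $\mathcal{G}$-measurable random variables Y such that for each k
the common value of $c_k := Y(\omega)$ for $\omega \in A_k$ satisfies

\[
\operatorname{ess\,sup}\{|X(\omega) - c_k| : \omega \in A_k\}
= \inf_{c \in \mathbb{K}} \operatorname{ess\,sup}\{|X(\omega) - c| : \omega \in A_k\}.
\]

Equivalently, if we define $\varepsilon(X, \mathcal{G})$ to be the
$\mathcal{G}$-measurable, $\mathbb{R}$-valued random variable that takes the
value $\inf_{c \in \mathbb{K}} \operatorname{ess\,sup}\{|X(\omega) - c| : \omega \in A_k\}$
on $A_k$, then $\mathbb{E}[X \mid \mathcal{G}]$ is the set of
$\mathcal{G}$-measurable random variables Y such that
$|X - Y| \le \varepsilon(X, \mathcal{G})$. Note that
$\varepsilon(X, \{\emptyset, \Omega\}) = \varepsilon(X)$ and
$\mathbb{E}[X \mid \{\emptyset, \Omega\}] = \mathbb{E}[X]$.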
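The finite-partition description can be turned into a short computation. The sketch below works in $\mathbb{Q}_5$ with rational values on a finite sample space in which every point is assumed to have positive probability, so that essential suprema are plain maxima. The dictionaries, the partition, and the function names are hypothetical.

```python
from fractions import Fraction


def padic_abs(x, p):
    """Exact p-adic absolute value of a rational x, with |0| = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, s = x.numerator, x.denominator, 0
    while num % p == 0:
        num, s = num // p, s + 1
    while den % p == 0:
        den, s = den // p, s - 1
    return Fraction(1, p) ** s


def conditional_spread(X, partition, p):
    """epsilon(X, G) as a dict: radius of the smallest ball holding X on each block."""
    eps = {}
    for block in partition:
        values = [Fraction(X[w]) for w in block]
        radius = max(padic_abs(v - values[0], p) for v in values)
        for w in block:
            eps[w] = radius
    return eps


def in_conditional_expectation(Y, X, partition, p):
    """Y is constant on blocks and |X - Y| <= epsilon(X, G) pointwise."""
    eps = conditional_spread(X, partition, p)
    measurable = all(len({Y[w] for w in block}) == 1 for block in partition)
    return measurable and all(
        padic_abs(Fraction(X[w]) - Y[w], p) <= eps[w] for w in X
    )


X = {"a": 2, "b": 7, "c": 27, "d": 3}
partition = [("a", "b", "c"), ("d",)]
eps = conditional_spread(X, partition, 5)
good = {"a": 12, "b": 12, "c": 12, "d": 3}   # 12 lies in the ball 2 + 5 Z_5
bad = {"a": 3, "b": 3, "c": 3, "d": 3}       # 3 does not
print(eps, in_conditional_expectation(good, X, partition, 5),
      in_conditional_expectation(bad, X, partition, 5))
```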

More generally, suppose that $\mathcal{G} \subseteq \mathcal{F}$ is an arbitrary
sub-σ-field and there is an associated regular conditional probability
$\mathbb{P}_{\mathcal{G}}(\omega', d\omega'')$ (such a regular conditional
probability certainly exists if $\mathcal{G}$ is finitely generated). In this
case, we expect that $\mathbb{E}[X \mid \mathcal{G}](\omega')$ should be the
expectation of X with respect to the probability measure
$\mathbb{P}_{\mathcal{G}}(\omega', \cdot)$. It is easy to see that if we let
$\varepsilon(X, \mathcal{G})$ be the $\mathcal{G}$-measurable random variable such
that $\varepsilon(X, \mathcal{G})(\omega')$ is the infimum over
$c \in \mathbb{K}$ of the essential supremum of $|X - c|$ with respect to
$\mathbb{P}_{\mathcal{G}}(\omega', \cdot)$, then this definition of
$\varepsilon(X, \mathcal{G})$ subsumes our previous one for the finitely
generated case and our putative definition of $\mathbb{E}[X \mid \mathcal{G}]$
coincides with the set of $\mathcal{G}$-measurable random variables Y such that
$|X - Y| \le \varepsilon(X, \mathcal{G})$, thereby also extending the definition
for the finitely generated case.

We therefore see that the key to giving a satisfactory general definition
of $\mathbb{E}[X \mid \mathcal{G}]$ for an arbitrary sub-σ-field
$\mathcal{G} \subseteq \mathcal{F}$ is to find a suitable general definition of
$\varepsilon(X, \mathcal{G})$. We tackle this problem in the next three
sections.



3. Conditional essential supremum 

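Definition 3.1. Given a non-negative real-valued random variable S
and a sub-σ-field $\mathcal{G} \subseteq \mathcal{F}$, put

\[
\operatorname{ess\,sup}\{S \mid \mathcal{G}\}
:= \sup_{p \ge 1} \mathbb{E}[S^p \mid \mathcal{G}]^{1/p}
= \lim_{p \to \infty} \mathbb{E}[S^p \mid \mathcal{G}]^{1/p}.
\]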
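On a finite probability space with a σ-field generated by a partition, the conditional $L^p$ norms in this definition can be computed directly and watched as they increase toward the blockwise maximum. The following floating-point sketch uses a hypothetical three-point space; the names `conditional_lp_norm` and `conditional_ess_sup` are assumptions made for the illustration.

```python
def conditional_lp_norm(S, prob, partition, p):
    """E[S**p | G]**(1/p) as a dict, for G generated by a finite partition."""
    out = {}
    for block in partition:
        mass = sum(prob[w] for w in block)
        moment = sum(prob[w] * S[w] ** p for w in block) / mass
        for w in block:
            out[w] = moment ** (1.0 / p)
    return out


def conditional_ess_sup(S, partition):
    """Blockwise maximum of S (all points are assumed to have positive mass)."""
    return {w: max(S[v] for v in block) for block in partition for w in block}


S = {"a": 1.0, "b": 3.0, "c": 2.0}
prob = {"a": 0.5, "b": 0.25, "c": 0.25}
partition = [("a", "b"), ("c",)]

exponents = [1, 2, 10, 200]
norms = [conditional_lp_norm(S, prob, partition, p) for p in exponents]
limit = conditional_ess_sup(S, partition)
for p, n in zip(exponents, norms):
    print(p, n["a"], n["c"])
print(limit)
```

On the block {a, b} the norms increase with p toward 3, the larger of the two values of S there, while on the singleton block {c} they equal 2 up to rounding for every p.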

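Lemma 3.2. (i) Suppose that S is a non-negative real-valued
random variable and $\mathcal{G}$ is a sub-σ-field of $\mathcal{F}$. Then
$S \le \operatorname{ess\,sup}\{S \mid \mathcal{G}\}$.

(ii) Suppose that S and $\mathcal{G}$ are as in (i) and T is a
$\mathcal{G}$-measurable real-valued random variable with $S \le T$. Then
$\operatorname{ess\,sup}\{S \mid \mathcal{G}\} \le T$.

(iii) Suppose that S' and S'' are non-negative real-valued random
variables and $\mathcal{G}$ is a sub-σ-field of $\mathcal{F}$. Then

\[
\operatorname{ess\,sup}\{S' \vee S'' \mid \mathcal{G}\}
= \operatorname{ess\,sup}\{S' \mid \mathcal{G}\} \vee \operatorname{ess\,sup}\{S'' \mid \mathcal{G}\}.
\]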

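Proof. For part (i), we show by separate arguments that the result
holds on the events $\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} = 0\}$ and
$\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} > 0\}$.

First consider what happens on the event
$\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} = 0\}$. By definition
$\mathbb{E}[S \mid \mathcal{G}] \le \operatorname{ess\,sup}\{S \mid \mathcal{G}\}$.
Hence

\[
\begin{aligned}
\mathbb{E}[S\, 1\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} = 0\}]
&\le \mathbb{E}[S\, 1\{\mathbb{E}[S \mid \mathcal{G}] = 0\}] \\
&= \mathbb{E}[\mathbb{E}[S\, 1\{\mathbb{E}[S \mid \mathcal{G}] = 0\} \mid \mathcal{G}]] \\
&= \mathbb{E}[1\{\mathbb{E}[S \mid \mathcal{G}] = 0\}\, \mathbb{E}[S \mid \mathcal{G}]] = 0.
\end{aligned}
\]

Thus $\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} = 0\} \subseteq \{S = 0\}$, and
$S \le \operatorname{ess\,sup}\{S \mid \mathcal{G}\}$ on the event
$\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} = 0\}$.

Now consider what happens on the event
$\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} > 0\}$. Take $a > 1$. Observe for
$p \ge 1$ that

\[
\begin{aligned}
\mathbb{E}[S^p \mid \mathcal{G}]
&\ge \mathbb{E}[S^p\, 1\{S^p > a^p \mathbb{E}[S^p \mid \mathcal{G}]\} \mid \mathcal{G}] \\
&\ge \mathbb{E}[a^p \mathbb{E}[S^p \mid \mathcal{G}]\, 1\{S^p > a^p \mathbb{E}[S^p \mid \mathcal{G}]\} \mid \mathcal{G}] \\
&= a^p\, \mathbb{E}[S^p \mid \mathcal{G}]\, \mathbb{P}\{S^p > a^p \mathbb{E}[S^p \mid \mathcal{G}] \mid \mathcal{G}\}.
\end{aligned}
\]

Hence, for each $p \ge 1$,

\[
\mathbb{P}\{S > a \operatorname{ess\,sup}\{S \mid \mathcal{G}\} \mid \mathcal{G}\}
\le \mathbb{P}\{S > a\, \mathbb{E}[S^p \mid \mathcal{G}]^{1/p} \mid \mathcal{G}\}
\le \frac{1}{a^p}
\]

on the event $\{\mathbb{E}[S^p \mid \mathcal{G}] > 0\}$.

Since $\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} > 0\} \subseteq \bigcup_p \bigcap_{q \ge p} \{\mathbb{E}[S^q \mid \mathcal{G}] > 0\}$,
we see that
$\mathbb{P}\{S > a \operatorname{ess\,sup}\{S \mid \mathcal{G}\} \mid \mathcal{G}\} = 0$
on the event $\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} > 0\}$. As this holds
for all $a > 1$, we conclude that
$S \le \operatorname{ess\,sup}\{S \mid \mathcal{G}\}$ on the event
$\{\operatorname{ess\,sup}\{S \mid \mathcal{G}\} > 0\}$, and this completes the
proof of part (i).

Part (ii) is immediate from the definition.

Now consider part (iii). We have from part (i) that
$S' \le \operatorname{ess\,sup}\{S' \mid \mathcal{G}\}$ and
$S'' \le \operatorname{ess\,sup}\{S'' \mid \mathcal{G}\}$. Thus
$S' \vee S'' \le \operatorname{ess\,sup}\{S' \mid \mathcal{G}\} \vee \operatorname{ess\,sup}\{S'' \mid \mathcal{G}\}$
and hence

\[
\operatorname{ess\,sup}\{S' \vee S'' \mid \mathcal{G}\}
\le \operatorname{ess\,sup}\{S' \mid \mathcal{G}\} \vee \operatorname{ess\,sup}\{S'' \mid \mathcal{G}\}
\]

by part (ii). On the other hand, because $S' \le S' \vee S''$ and
$S'' \le S' \vee S''$, it follows that
$\operatorname{ess\,sup}\{S' \mid \mathcal{G}\} \le \operatorname{ess\,sup}\{S' \vee S'' \mid \mathcal{G}\}$
and
$\operatorname{ess\,sup}\{S'' \mid \mathcal{G}\} \le \operatorname{ess\,sup}\{S' \vee S'' \mid \mathcal{G}\}$.
Therefore

\[
\operatorname{ess\,sup}\{S' \mid \mathcal{G}\} \vee \operatorname{ess\,sup}\{S'' \mid \mathcal{G}\}
\le \operatorname{ess\,sup}\{S' \vee S'' \mid \mathcal{G}\}.
\]

□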

Corollary 3.3. Suppose that S is a non-negative real-valued random
variable and $\mathcal{G} \subseteq \mathcal{H}$ are sub-σ-fields of
$\mathcal{F}$. Then
$\operatorname{ess\,sup}\{S \mid \mathcal{H}\} \le \operatorname{ess\,sup}\{S \mid \mathcal{G}\}$.

Proof. From Lemma 3.2(i),
$S \le \operatorname{ess\,sup}\{S \mid \mathcal{G}\}$. Applying Lemma 3.2(ii) with
$\mathcal{G}$ replaced by $\mathcal{H}$ and
$T = \operatorname{ess\,sup}\{S \mid \mathcal{G}\}$ gives the result. □

Let $\{\mathcal{F}_n\}_{n=0}^\infty$ be a filtration (that is, a non-decreasing
sequence of sub-σ-fields of $\mathcal{F}$). Recall that a random variable T
with values in $\{0, 1, 2, \ldots\}$ is a stopping time for the filtration if
$\{T = n\} \in \mathcal{F}_n$ for all n. Recall also that if T is a stopping
time, then the associated σ-field $\mathcal{F}_T$ is the collection of events A
such that $A \cap \{T = n\} \in \mathcal{F}_n$ for all n.

Lemma 3.4. Suppose that S is a non-negative real-valued random
variable, $\{\mathcal{F}_n\}_{n=0}^\infty$ is a filtration of sub-σ-fields of
$\mathcal{F}$, and T is a stopping time. Then

\[
\begin{aligned}
\operatorname{ess\,sup}\{S\, 1\{T = n\} \mid \mathcal{F}_T\}
&= 1\{T = n\} \operatorname{ess\,sup}\{S \mid \mathcal{F}_T\} \\
&= 1\{T = n\} \operatorname{ess\,sup}\{S \mid \mathcal{F}_n\}
= \operatorname{ess\,sup}\{S\, 1\{T = n\} \mid \mathcal{F}_n\}
\end{aligned}
\]

for all n.

Proof. This follows immediately from the definition of the conditional
essential supremum and the fact that if U is a non-negative real-valued
random variable, then
$\operatorname{ess\,sup}\{U \mid \mathcal{F}_T\} = \operatorname{ess\,sup}\{U \mid \mathcal{F}_n\}$
on the event $\{T = n\}$ (see, for example, Proposition II-1-3 of [Nev75]). □



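4. Conditional $L^\infty$ norm

Definition 4.1. Given $X \in L^\infty$ and a sub-σ-field
$\mathcal{G} \subseteq \mathcal{F}$, put

\[
\|X\|_{\mathcal{G}} := \operatorname{ess\,sup}\{|X| \mid \mathcal{G}\}.
\]

Notation 4.2. Given $A \in \mathcal{F}$, the $\mathbb{K}$-valued random variable
$1_A$ is given by

\[
1_A(\omega) =
\begin{cases}
1_{\mathbb{K}}, & \text{if } \omega \in A, \\
0_{\mathbb{K}}, & \text{otherwise,}
\end{cases}
\]

where $1_{\mathbb{K}}$ and $0_{\mathbb{K}}$ are, respectively, the multiplicative
and additive identity elements of $\mathbb{K}$. We continue to use this same
notation to also denote the analogously defined real-valued indicator
random variable, but this should cause no confusion as the meaning will be
clear from the context.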

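Lemma 4.3. Fix a sub-σ-field $\mathcal{G} \subseteq \mathcal{F}$.

(i) If $W \in L^\infty(\mathcal{G})$ and $X \in L^\infty$, then
$\|WX\|_{\mathcal{G}} = |W|\, \|X\|_{\mathcal{G}}$.

(ii) If $X, Y \in L^\infty$ are such that $\mathbb{P}(\{X \ne Y\} \cap A) = 0$ for
some $A \in \mathcal{G}$, then
$\mathbb{P}(\{\|X\|_{\mathcal{G}} \ne \|Y\|_{\mathcal{G}}\} \cap A) = 0$.

(iii) If $X_1, X_2, \ldots \in L^\infty$ and $A_1, A_2, \ldots \in \mathcal{G}$ are
pairwise disjoint, then

\[
\Big\| \sum_i 1_{A_i} X_i \Big\|_{\mathcal{G}} = \sum_i 1_{A_i} \|X_i\|_{\mathcal{G}}.
\]

(iv) If $X, Y \in L^\infty$, then

\[
\|X + Y\|_{\mathcal{G}} \le \|X\|_{\mathcal{G}} \vee \|Y\|_{\mathcal{G}}.
\]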

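Proof. Part (i) follows immediately from the definition. Part (ii) follows
from part (i): since $X 1_A = Y 1_A$ by assumption,

\[
1_A \|X\|_{\mathcal{G}} = \|X 1_A\|_{\mathcal{G}} = \|Y 1_A\|_{\mathcal{G}} = 1_A \|Y\|_{\mathcal{G}}.
\]

Part (iii) follows from parts (i) and (ii): for any of the events $A_j$,

\[
1_{A_j} \Big\| \sum_i 1_{A_i} X_i \Big\|_{\mathcal{G}}
= 1_{A_j} \|X_j\|_{\mathcal{G}} = \|1_{A_j} X_j\|_{\mathcal{G}}
\]

and, similarly,
$\| \sum_i 1_{A_i} X_i \|_{\mathcal{G}} = \sum_i 1_{A_i} \|X_i\|_{\mathcal{G}}$ on
$\Omega \setminus (\bigcup_j A_j)$.

Part (iv) is an immediate consequence of Lemma 3.2(iii). However,
there is also the following alternative, more elementary proof. Note
first that $\| |X|^r \|_{\mathcal{G}} = \|X\|_{\mathcal{G}}^r$ for any $r > 0$
because

\[
\lim_{p \to \infty} \mathbb{E}[|X|^{rp} \mid \mathcal{G}]^{1/p}
= \lim_{q \to \infty} \mathbb{E}[|X|^{q} \mid \mathcal{G}]^{r/q}
= \Big( \lim_{q \to \infty} \mathbb{E}[|X|^{q} \mid \mathcal{G}]^{1/q} \Big)^r.
\]

Thus, from Jensen's inequality and the observation that
$(x + y)^s \le (x^s + y^s)$ for $0 < s \le 1$,

\[
\begin{aligned}
\|X + Y\|_{\mathcal{G}}
&= \lim_{p \to \infty} \mathbb{E}[|X + Y|^p \mid \mathcal{G}]^{1/p} \\
&\le \lim_{p \to \infty} \mathbb{E}[|X|^p \vee |Y|^p \mid \mathcal{G}]^{1/p} \\
&= \lim_{p \to \infty} \mathbb{E}\big[\lim_{r \to \infty} (|X|^{pr} + |Y|^{pr})^{1/r} \mid \mathcal{G}\big]^{1/p} \\
&\le \lim_{p, r \to \infty} \big(\mathbb{E}[|X|^{pr} \mid \mathcal{G}] + \mathbb{E}[|Y|^{pr} \mid \mathcal{G}]\big)^{1/(pr)} \\
&\le \lim_{p, r \to \infty} \big(\mathbb{E}[|X|^{rp} \mid \mathcal{G}]^{1/p} + \mathbb{E}[|Y|^{rp} \mid \mathcal{G}]^{1/p}\big)^{1/r} \\
&= \lim_{r \to \infty} \big(\|X\|_{\mathcal{G}}^r + \|Y\|_{\mathcal{G}}^r\big)^{1/r} \\
&= \|X\|_{\mathcal{G}} \vee \|Y\|_{\mathcal{G}}.
\end{aligned}
\]

□

The following result is immediate from Corollary 3.3.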



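Lemma 4.4. Suppose that $X \in L^\infty$ and $\mathcal{G} \subseteq \mathcal{H}$
are sub-σ-fields of $\mathcal{F}$. Then
$\|X\|_{\mathcal{H}} \le \|X\|_{\mathcal{G}}$.

The following result is immediate from Lemma 3.4.

Lemma 4.5. Suppose that $X \in L^\infty$, $\{\mathcal{F}_n\}_{n=0}^\infty$ is a
filtration of sub-σ-fields of $\mathcal{F}$, and T is a stopping time. Then

\[
\begin{aligned}
\|X 1\{T = n\}\|_{\mathcal{F}_T} &= 1\{T = n\} \|X\|_{\mathcal{F}_T} \\
&= 1\{T = n\} \|X\|_{\mathcal{F}_n} = \|X 1\{T = n\}\|_{\mathcal{F}_n}
\end{aligned}
\]

for all n.
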

5. Construction of Conditional Expectation 

Definition 5.1. Given $X \in L^\infty$ and a sub-σ-field
$\mathcal{G} \subseteq \mathcal{F}$, set

\[
\mathbb{E}[X \mid \mathcal{G}] := \{Y \in L^\infty(\mathcal{G}) :
\|X - Y\|_{\mathcal{G}} \le \|X - Z\|_{\mathcal{G}} \text{ for all } Z \in L^\infty(\mathcal{G})\}.
\]

Remark 5.2. Before showing that $\mathbb{E}[X \mid \mathcal{G}]$ is non-empty, we
comment on a slight subtlety in the definition. One way of thinking of our
definition of $\mathbb{E}[X]$ as the set of $c \in \mathbb{K}$ for which
$\|X - c\|_\infty$ is minimal, is that $\mathbb{E}[X]$ is the set of projections
of X onto $\mathbb{K} = L^\infty(\{\emptyset, \Omega\})$. A possible definition of
$\mathbb{E}[X \mid \mathcal{G}]$ might therefore be the analogous set of
projections of X onto $L^\infty(\mathcal{G})$, that is, the set of
$Y \in L^\infty(\mathcal{G})$ that minimize $\|X - Y\|_\infty$. This definition is
not equivalent to ours. For example, suppose that $\Omega$ consists of the
three points $\{\alpha, \beta, \gamma\}$, $\mathcal{F}$ consists of all subsets of
$\Omega$, $\mathbb{P}$ assigns positive mass to each point of $\Omega$,
$\mathcal{G} = \sigma\{\{\alpha, \beta\}, \{\gamma\}\}$, and X is given by
$X(\alpha) = 1_{\mathbb{K}}$, $X(\beta) = 0_{\mathbb{K}}$, and
$X(\gamma) = 0_{\mathbb{K}}$. Consider $Y \in L^\infty(\mathcal{G})$, so that
$Y(\alpha) = Y(\beta) = c$ and $Y(\gamma) = d$ for some $c, d \in \mathbb{K}$. In
order that $Y \in \mathbb{E}[X \mid \mathcal{G}]$ according to our definition, c
and d must be chosen to minimize both
$|1_{\mathbb{K}} - c| \vee |0_{\mathbb{K}} - c|$ and $|0_{\mathbb{K}} - d|$. By the
strong triangle inequality, $|1_{\mathbb{K}} - c| \vee |0_{\mathbb{K}} - c|$ is
minimized by any c with $|c| \le 1$, with the corresponding minimal value
being 1. Of course, $|0_{\mathbb{K}} - d|$ is minimized by the unique value
$d = 0_{\mathbb{K}}$. On the other hand, in order that Y is a projection of X
onto $L^\infty(\mathcal{G})$, the points c and d must be chosen to minimize
$|1_{\mathbb{K}} - c| \vee |0_{\mathbb{K}} - c| \vee |0_{\mathbb{K}} - d|$, and this
is accomplished as long as $|c| \le 1$ and $|d| \le 1$. We don't belabor the
point in what follows, but several of the natural counterparts of standard
results for classical conditional expectation that we show hold for our
definition fail to hold for the "projection" definition.
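The three-point example of Remark 5.2 can be reproduced numerically by taking $\mathbb{K} = \mathbb{Q}_2$, an assumption made only for this illustration. The sketch compares the blockwise criterion of Definition 5.1 with the global "projection" criterion; the function names are hypothetical, and every sample point is assumed to carry positive mass so that essential suprema are maxima.

```python
from fractions import Fraction


def padic_abs(x, p):
    """Exact p-adic absolute value of a rational x, with |0| = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, s = x.numerator, x.denominator, 0
    while num % p == 0:
        num, s = num // p, s + 1
    while den % p == 0:
        den, s = den // p, s - 1
    return Fraction(1, p) ** s


def block_radius(X, block, p):
    """Minimal value of max |X - c| over the block (radius of the smallest ball)."""
    values = [Fraction(X[w]) for w in block]
    return max(padic_abs(v - values[0], p) for v in values)


def block_error(X, Y, block, p):
    """max |X - Y| over the block."""
    return max(padic_abs(Fraction(X[w]) - Y[w], p) for w in block)


def in_conditional_expectation(Y, X, partition, p):
    """Blockwise criterion: the error is minimal on every block separately."""
    return all(block_error(X, Y, b, p) == block_radius(X, b, p) for b in partition)


def is_sup_norm_projection(Y, X, partition, p):
    """Global criterion: ||X - Y||_inf equals the best achievable global value."""
    best = max(block_radius(X, b, p) for b in partition)
    return max(block_error(X, Y, b, p) for b in partition) == best


X = {"alpha": 1, "beta": 0, "gamma": 0}
partition = [("alpha", "beta"), ("gamma",)]
Y_zero = {"alpha": 0, "beta": 0, "gamma": 0}   # c = 0, d = 0
Y_two = {"alpha": 0, "beta": 0, "gamma": 2}    # c = 0, d = 2 with |d| = 1/2

for Y in (Y_zero, Y_two):
    print(in_conditional_expectation(Y, X, partition, 2),
          is_sup_norm_projection(Y, X, partition, 2))
```

The choice d = 2 is a projection in the global sense, because the error 1/2 on {gamma} is hidden by the unavoidable error 1 on {alpha, beta}, but it fails the blockwise criterion.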

The following lemma is used below to show that
$\mathbb{E}[X \mid \mathcal{G}]$ is non-empty.

Lemma 5.3. Suppose that $X \in L^\infty$ is not $0_{\mathbb{K}}$ almost surely, and
$\mathcal{G}$ is a sub-σ-field of $\mathcal{F}$. Set $q^{-N} = \|X\|_\infty$. Then
there exist disjoint events $A_0, A_1, \ldots \in \mathcal{G}$ and random
variables $Y_0, Y_1, \ldots \in L^\infty(\mathcal{G})$ with the following
properties:

(1) On the event $A_n$, $\|X - Z\|_{\mathcal{G}} \ge q^{-(N+n)}$ for every
$Z \in L^\infty(\mathcal{G})$.

(2) On the event $A_n$, $\|X - Y_n\|_{\mathcal{G}} = q^{-(N+n)}$.

(3) On the event $\Omega \setminus \bigcup_{k=0}^{n} A_k$,
$\|X - Y_n\|_{\mathcal{G}} \le q^{-(N+n+1)}$.

(4) On the event $\bigcup_{k=0}^{n} A_k$, $Y_p = Y_n$ for any $p > n$.

(5) The event $\bigcup_{k=0}^{\infty} A_k$ has probability one.

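Proof. Suppose without loss of generality that $\|X\|_\infty = 1$, so that
N = 0. Set
$\mathcal{Z}_0 := \{Z \in L^\infty(\mathcal{G}) : \|X - Z\|_\infty \le 1\}$. Note that
the constant 0 belongs to $\mathcal{Z}_0$ and so this set is non-empty. Put
$\delta_0 := \inf_{Z \in \mathcal{Z}_0} \mathbb{P}\{\|X - Z\|_{\mathcal{G}} = 1\}$.

Choose $Z_{0,1}, Z_{0,2}, \ldots \in \mathcal{Z}_0$ with

\[
\lim_{m \to \infty} \mathbb{P}\{\|X - Z_{0,m}\|_{\mathcal{G}} = 1\} = \delta_0.
\]

Define $Z'_{0,1}, Z'_{0,2}, \ldots$ inductively by setting
$Z'_{0,1} := Z_{0,1}$ and

\[
Z'_{0,m+1}(\omega) :=
\begin{cases}
Z'_{0,m}(\omega), & \text{if } \|X - Z'_{0,m}\|_{\mathcal{G}}(\omega) \le \|X - Z_{0,m+1}\|_{\mathcal{G}}(\omega), \\
Z_{0,m+1}(\omega), & \text{if } \|X - Z'_{0,m}\|_{\mathcal{G}}(\omega) > \|X - Z_{0,m+1}\|_{\mathcal{G}}(\omega).
\end{cases}
\]

Note that the events $B_{0,m} := \{\|X - Z'_{0,m}\|_{\mathcal{G}} = 1\}$ are
decreasing and the $B_{0,m}$ are contained in the event
$\{\|X - Z_{0,m}\|_{\mathcal{G}} = 1\}$. Hence the event
$A_0 := \lim_{m \to \infty} B_{0,m} = \bigcap_{m=1}^{\infty} B_{0,m}$ has probability
$\delta_0$.

Define $Y_0$ by

\[
Y_0(\omega) :=
\begin{cases}
Z'_{0,1}(\omega), & \text{if } \omega \in (\Omega \setminus B_{0,1}) \cup A_0, \\
Z'_{0,m}(\omega), & \text{if } \omega \in (\Omega \setminus B_{0,m}) \setminus (\Omega \setminus B_{0,m-1}), \ m \ge 2.
\end{cases}
\]

It is clear that $\|X - Y_0\|_{\mathcal{G}} = 1$ on the event $A_0$ and
$\|X - Y_0\|_{\mathcal{G}} \le q^{-1}$ on the event $\Omega \setminus A_0$.
Moreover, if there existed $V \in L^\infty(\mathcal{G})$ with

\[
\mathbb{P}(\{\|X - V\|_{\mathcal{G}} \le q^{-1}\} \cap A_0) > 0,
\]

then we would have the contradiction that $W \in \mathcal{Z}_0$ defined by

\[
W(\omega) :=
\begin{cases}
Y_0(\omega), & \text{if } \|X - Y_0\|_{\mathcal{G}}(\omega) \le \|X - V\|_{\mathcal{G}}(\omega), \\
V(\omega), & \text{if } \|X - Y_0\|_{\mathcal{G}}(\omega) > \|X - V\|_{\mathcal{G}}(\omega),
\end{cases}
\]

would satisfy $\mathbb{P}\{\|X - W\|_{\mathcal{G}} = 1\} < \delta_0$.

Now suppose that $A_0, \ldots, A_{n-1}$ and $Y_0, \ldots, Y_{n-1}$ have been
constructed with the requisite properties. If
$\mathbb{P}(\Omega \setminus \bigcup_{k=0}^{n-1} A_k) = 0$, then take
$A_n = \emptyset$ and $Y_n = Y_{n-1}$ (recall that we are interpreting all
equalities and inequalities as holding $\mathbb{P}$-a.s.). Otherwise, set

\[
\mathcal{Z}_n := \Big\{ Z \in L^\infty(\mathcal{G}) : Z = Y_{n-1} \text{ on } \bigcup_{k=0}^{n-1} A_k
\text{ and } |X - Z| \le q^{-n} \text{ on } \Omega \setminus \bigcup_{k=0}^{n-1} A_k \Big\}.
\]

Note that $Y_{n-1}$ belongs to $\mathcal{Z}_n$. Put
$\delta_n := \inf_{Z \in \mathcal{Z}_n} \mathbb{P}\{\|X - Z\|_{\mathcal{G}} = q^{-n}\}$.
An argument very similar to the above with $\mathcal{Z}_n$ and $\delta_n$
replacing $\mathcal{Z}_0$ and $\delta_0$ establishes the existence of $A_n$ and
$Y_n$ with the desired properties. □

Theorem 5.4. Given $X \in L^\infty$ and a sub-σ-algebra
$\mathcal{G} \subseteq \mathcal{F}$, the conditional expectation
$\mathbb{E}[X \mid \mathcal{G}]$ is nonempty.

Proof. If X is $0_{\mathbb{K}}$ almost surely, then
$\mathbb{E}[X \mid \mathcal{G}] = \{0_{\mathbb{K}}\}$. Otherwise, let
$A_0, A_1, \ldots \in \mathcal{G}$ and $Y_0, Y_1, \ldots \in L^\infty(\mathcal{G})$ be
as in Lemma 5.3. Then Y defined by $Y(\omega) = Y_n(\omega)$ for
$\omega \in A_n$ belongs to $\mathbb{E}[X \mid \mathcal{G}]$. □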

6. Elementary Properties of Conditional Expectation 

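Proposition 6.1. Fix a sub-σ-field $\mathcal{G} \subseteq \mathcal{F}$.

(i) Suppose that $X \in L^\infty(\mathcal{G})$ and $Y \in L^\infty$. Then

\[
\mathbb{E}[XY \mid \mathcal{G}] = X\, \mathbb{E}[Y \mid \mathcal{G}]
\]

and

\[
\mathbb{E}[X + Y \mid \mathcal{G}] = X + \mathbb{E}[Y \mid \mathcal{G}].
\]

(ii) If $X, Y \in L^\infty$ are such that $\mathbb{P}(\{X \ne Y\} \cap A) = 0$ for
some $A \in \mathcal{G}$, then
$1_A \mathbb{E}[X \mid \mathcal{G}] = 1_A \mathbb{E}[Y \mid \mathcal{G}]$.

(iii) If $X_1, X_2, \ldots \in L^\infty$ and $A_1, A_2, \ldots \in \mathcal{G}$ are
pairwise disjoint, then

\[
\mathbb{E}\Big[ \sum_i 1_{A_i} X_i \,\Big|\, \mathcal{G} \Big] = \sum_i 1_{A_i} \mathbb{E}[X_i \mid \mathcal{G}].
\]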

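Proof. Consider part (i). We first show the inclusion
$\mathbb{E}[XY \mid \mathcal{G}] \subseteq X\, \mathbb{E}[Y \mid \mathcal{G}]$.

Consider $Z \in \mathbb{E}[XY \mid \mathcal{G}]$. Choose some
$V \in \mathbb{E}[Y \mid \mathcal{G}]$ and set
$W = (Z/X)\, 1\{X \ne 0\} + V\, 1\{X = 0\} \in L^\infty(\mathcal{G})$. Note that
$\mathbb{P}\{Z \ne 0, X = 0\} = 0$ and hence $XW = Z$, because otherwise we
would have the contradiction
$\|XY - Z\, 1\{X \ne 0\}\|_{\mathcal{G}} \le \|XY - Z\|_{\mathcal{G}}$ and
$\mathbb{P}\{\|XY - Z\, 1\{X \ne 0\}\|_{\mathcal{G}} < \|XY - Z\|_{\mathcal{G}}\} > 0$
by Lemma 4.3(ii).

We need to show that $W \in \mathbb{E}[Y \mid \mathcal{G}]$. Consider
$U \in L^\infty(\mathcal{G})$. By Lemma 4.3(ii) and the assumption that
$V \in \mathbb{E}[Y \mid \mathcal{G}]$,

\[
\|Y - W\|_{\mathcal{G}} = \|Y - V\|_{\mathcal{G}} \le \|Y - U\|_{\mathcal{G}}
\]

on the event $\{X = 0\}$. Also,
$\|XY - Z\|_{\mathcal{G}} \le \|XY - XU\|_{\mathcal{G}}$ by the assumption that
$Z \in \mathbb{E}[XY \mid \mathcal{G}]$, and so, by Lemma 4.3(i)+(ii),

\[
\begin{aligned}
\|Y - W\|_{\mathcal{G}} &= \|Y - Z/X\|_{\mathcal{G}} = |X|^{-1} \|XY - Z\|_{\mathcal{G}} \\
&\le |X|^{-1} \|XY - XU\|_{\mathcal{G}} = \|Y - U\|_{\mathcal{G}}
\end{aligned}
\]

on the event $\{X \ne 0\}$. Thus
$\|Y - W\|_{\mathcal{G}} \le \|Y - U\|_{\mathcal{G}}$ for any
$U \in L^\infty(\mathcal{G})$ and hence $W \in \mathbb{E}[Y \mid \mathcal{G}]$.

We now show the converse inclusion
$X\, \mathbb{E}[Y \mid \mathcal{G}] \subseteq \mathbb{E}[XY \mid \mathcal{G}]$.

Choose $W \in \mathbb{E}[Y \mid \mathcal{G}]$. We need to show that
$XW \in \mathbb{E}[XY \mid \mathcal{G}]$. Consider $U \in L^\infty(\mathcal{G})$. Put
$V = (U/X)\, 1\{X \ne 0\}$. We have
$\|Y - W\|_{\mathcal{G}} \le \|Y - V\|_{\mathcal{G}}$ by the assumption that
$W \in \mathbb{E}[Y \mid \mathcal{G}]$. From Lemma 4.3(i)+(ii),

\[
\begin{aligned}
\|XY - XW\|_{\mathcal{G}} &= |X|\, \|Y - W\|_{\mathcal{G}} \le |X|\, \|Y - V\|_{\mathcal{G}} = \|XY - XV\|_{\mathcal{G}} \\
&= \|XY - U\|_{\mathcal{G}}\, 1\{X \ne 0\} \le \|XY - U\|_{\mathcal{G}},
\end{aligned}
\]

as required.

The proof of the claim
$\mathbb{E}[X + Y \mid \mathcal{G}] = X + \mathbb{E}[Y \mid \mathcal{G}]$ is similar
but easier, so we omit it.

Parts (ii) and (iii) follow straightforwardly from parts (ii) and (iii)
of Lemma 4.3. □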

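Proposition 6.2. Let $\mathcal{G}$ be a sub-σ-algebra of $\mathcal{F}$. Suppose
that $X \in L^\infty$ is independent of $\mathcal{G}$. Then
$\mathbb{E}[X \mid \mathcal{G}]$ is the set of random variables
$Y \in L^\infty(\mathcal{G})$ that take values in $\mathbb{E}[X]$.

Proof. Observe for any $Z \in L^\infty(\mathcal{G})$, that, by the assumption of
independence of X from $\mathcal{G}$,

\[
\begin{aligned}
\|X - Z\|_{\mathcal{G}}(\omega)
&= \sup_{p \ge 1} \big(\mathbb{E}[|X - Z|^p \mid \mathcal{G}](\omega)\big)^{1/p} \\
&= \sup_{p \ge 1} \Big( \int |x - Z(\omega)|^p \, \mathbb{P}\{X \in dx\} \Big)^{1/p} \\
&= \sup\{|x - Z(\omega)| : x \in \operatorname{supp} X\} \\
&\begin{cases}
= \varepsilon(X), & \text{if } Z(\omega) \in \mathbb{E}[X], \\
> \varepsilon(X), & \text{otherwise,}
\end{cases}
\end{aligned}
\]

and the result follows. □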

7. Conditional spread and the tower property 

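Definition 7.1. Given $X \in L^\infty$ and a sub-σ-field $\mathcal{G}$ of
$\mathcal{F}$, let $\varepsilon(X, \mathcal{G})$ denote the common value of
$\|X - Y\|_{\mathcal{G}}$ for $Y \in \mathbb{E}[X \mid \mathcal{G}]$.

Lemma 7.2. If $X \in L^\infty$ and $\mathcal{G} \subseteq \mathcal{H}$ are
sub-σ-fields of $\mathcal{F}$, then
$\varepsilon(X, \mathcal{H}) \le \varepsilon(X, \mathcal{G})$.

Proof. Suppose that $V \in \mathbb{E}[X \mid \mathcal{G}]$ and
$W \in \mathbb{E}[X \mid \mathcal{H}]$. From Lemma 4.4,

\[
\varepsilon(X, \mathcal{H}) = \|X - W\|_{\mathcal{H}} \le \|X - V\|_{\mathcal{H}}
\le \|X - V\|_{\mathcal{G}} = \varepsilon(X, \mathcal{G}).
\]

□

Lemma 7.3. A random variable Y belongs to $\mathbb{E}[X \mid \mathcal{G}]$ if and
only if $Y \in L^\infty(\mathcal{G})$ and $|X - Y| \le \varepsilon(X, \mathcal{G})$.

Proof. Suppose Y is in $\mathbb{E}[X \mid \mathcal{G}]$. By definition,
$Y \in L^\infty(\mathcal{G})$. By Lemma 4.4,
$|X - Y| = \|X - Y\|_{\mathcal{F}} \le \|X - Y\|_{\mathcal{G}} = \varepsilon(X, \mathcal{G})$.

The converse is immediate from Lemma 3.2(ii). □

Lemma 7.4. Suppose that $X \in L^\infty$, $\mathcal{G} \subseteq \mathcal{H}$ are
sub-σ-fields of $\mathcal{F}$, and $Y \in \mathbb{E}[X \mid \mathcal{H}]$. Then
$\varepsilon(Y, \mathcal{G}) \le \varepsilon(X, \mathcal{G})$.

Proof. Consider $Z \in \mathbb{E}[X \mid \mathcal{G}]$. By Lemma 7.3 and Lemma 7.2,

\[
|Y - Z| \le |X - Y| \vee |X - Z| \le \varepsilon(X, \mathcal{H}) \vee \varepsilon(X, \mathcal{G})
= \varepsilon(X, \mathcal{G}).
\]

By Lemma 3.2(ii),
$\varepsilon(Y, \mathcal{G}) \le \|Y - Z\|_{\mathcal{G}} \le \varepsilon(X, \mathcal{G})$. □

Theorem 7.5. Suppose that $X \in L^\infty$ and
$\mathcal{G} \subseteq \mathcal{H}$ are sub-σ-fields of $\mathcal{F}$. If
$Y \in \mathbb{E}[X \mid \mathcal{H}]$ and $Z \in \mathbb{E}[Y \mid \mathcal{G}]$,
then $Z \in \mathbb{E}[X \mid \mathcal{G}]$.

Proof. By Lemma 7.3, Lemma 7.4, and Lemma 7.2,

\[
|X - Z| \le |X - Y| \vee |Y - Z| \le \varepsilon(X, \mathcal{H}) \vee \varepsilon(Y, \mathcal{G})
\le \varepsilon(X, \mathcal{G}).
\]

Thus Z is in $\mathbb{E}[X \mid \mathcal{G}]$ by another application of
Lemma 7.3. □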
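The tower property can be checked on a small finite example in $\mathbb{Q}_3$, using the characterization of Lemma 7.3 for σ-fields generated by partitions. Everything named below (the sample space, the two nested partitions, and the helper `pick_version`, which selects one particular element of a conditional expectation) is a hypothetical choice made for the illustration; all sample points are assumed to have positive probability.

```python
from fractions import Fraction


def padic_abs(x, p):
    """Exact p-adic absolute value of a rational x, with |0| = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, s = x.numerator, x.denominator, 0
    while num % p == 0:
        num, s = num // p, s + 1
    while den % p == 0:
        den, s = den // p, s - 1
    return Fraction(1, p) ** s


def conditional_spread(X, partition, p):
    """epsilon(X, G): blockwise radius of the smallest ball containing X."""
    eps = {}
    for block in partition:
        values = [Fraction(X[w]) for w in block]
        radius = max(padic_abs(v - values[0], p) for v in values)
        eps.update({w: radius for w in block})
    return eps


def in_conditional_expectation(Y, X, partition, p):
    """Lemma 7.3 on a finite space: Y constant on blocks, |X - Y| <= epsilon."""
    eps = conditional_spread(X, partition, p)
    measurable = all(len({Y[w] for w in block}) == 1 for block in partition)
    return measurable and all(padic_abs(Fraction(X[w]) - Y[w], p) <= eps[w] for w in X)


def pick_version(X, partition):
    """One element of E[X | G]: on each block, the value of X at its last point."""
    return {w: X[block[-1]] for block in partition for w in block}


p = 3
X = {0: 1, 1: 4, 2: 10, 3: 2}
H = [(0, 1), (2,), (3,)]        # finer partition
G = [(0, 1, 2), (3,)]           # coarser partition, so sigma(G) is inside sigma(H)

Y = pick_version(X, H)          # Y in E[X | H]
Z = pick_version(Y, G)          # Z in E[Y | G]
print(Y, Z, in_conditional_expectation(Z, X, G, p))
```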

8. Continuity of conditional expectation 

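Definition 8.1. Define the Hausdorff distance between two subsets A
and B of $L^\infty$ to be

\[
D_H(A, B) := \sup_{X \in A} \inf_{Y \in B} \|X - Y\|_\infty \ \vee\ \sup_{Y \in B} \inf_{X \in A} \|Y - X\|_\infty.
\]

Lemma 8.2. Suppose that A, B, C are subsets of $L^\infty$. Then
$D_H(A + C, B + C) \le D_H(A, B)$.

Proof. Suppose that $D_H(A, B) < \delta$ for some $\delta > 0$. By definition,
for every $X \in A$ there is a $Y \in B$ with $\|X - Y\|_\infty < \delta$, and
similarly with the roles of A and B reversed. If $U \in A + C$, then
$U = X + W$ for some $X \in A$ and $W \in C$. We know there is $Y \in B$ such
that $\|X - Y\|_\infty < \delta$. Note that $V := Y + W \in B + C$ and
$\|U - V\|_\infty = \|X - Y\|_\infty < \delta$. A similar argument with the roles
of A and B reversed shows that $D_H(A + C, B + C) \le \delta$. □

Theorem 8.3. Suppose that $X, Y \in L^\infty$ and $\mathcal{G}$ is a sub-σ-field
of $\mathcal{F}$. Then
$D_H(\mathbb{E}[X \mid \mathcal{G}], \mathbb{E}[Y \mid \mathcal{G}]) \le \|X - Y\|_\infty$.

Proof. Choose $U \in \mathbb{E}[X \mid \mathcal{G}]$ and
$V \in \mathbb{E}[Y \mid \mathcal{G}]$. From Lemma 4.3(iv),

\[
\varepsilon(Y, \mathcal{G}) \le \|Y - U\|_{\mathcal{G}} \le \|X - U\|_{\mathcal{G}} \vee \|X - Y\|_{\mathcal{G}}
= \varepsilon(X, \mathcal{G}) \vee \|X - Y\|_{\mathcal{G}}
\]

and

\[
\varepsilon(X, \mathcal{G}) \le \|X - V\|_{\mathcal{G}} \le \|Y - V\|_{\mathcal{G}} \vee \|X - Y\|_{\mathcal{G}}
= \varepsilon(Y, \mathcal{G}) \vee \|X - Y\|_{\mathcal{G}}.
\]

It follows that $\varepsilon(X, \mathcal{G}) = \varepsilon(Y, \mathcal{G})$ on the
event
$M := \{\|X - Y\|_{\mathcal{G}} < \varepsilon(X, \mathcal{G}) \vee \varepsilon(Y, \mathcal{G})\}$
and

\[
\varepsilon(X, \mathcal{G}) = \|Y - U\|_{\mathcal{G}} = \|X - U\|_{\mathcal{G}} = \varepsilon(X, \mathcal{G})
\]

and

\[
\varepsilon(Y, \mathcal{G}) = \|X - V\|_{\mathcal{G}} = \|Y - V\|_{\mathcal{G}} = \varepsilon(Y, \mathcal{G})
\]

on M.

By Proposition 6.1,
$U 1_M \in \mathbb{E}[Y 1_M \mid \mathcal{G}] = 1_M \mathbb{E}[Y \mid \mathcal{G}]$ and
$V 1_M \in \mathbb{E}[X 1_M \mid \mathcal{G}] = 1_M \mathbb{E}[X \mid \mathcal{G}]$.
Thus $1_M \mathbb{E}[X \mid \mathcal{G}] = 1_M \mathbb{E}[Y \mid \mathcal{G}]$.

Furthermore, on the event
$N := \{\|X - Y\|_{\mathcal{G}} \ge \varepsilon(X, \mathcal{G}) \vee \varepsilon(Y, \mathcal{G})\}$,

\[
\begin{aligned}
\|U - V\|_{\mathcal{G}} &\le \|U - X\|_{\mathcal{G}} \vee \|X - Y\|_{\mathcal{G}} \vee \|Y - V\|_{\mathcal{G}} \\
&\le \varepsilon(X, \mathcal{G}) \vee \|X - Y\|_\infty \vee \varepsilon(Y, \mathcal{G}) \\
&= \|X - Y\|_\infty,
\end{aligned}
\]

and so
$\|U 1_N - V 1_N\|_\infty \le \|X 1_N - Y 1_N\|_\infty \le \|X - Y\|_\infty$.
Therefore,

\[
D_H(1_N \mathbb{E}[X \mid \mathcal{G}], 1_N \mathbb{E}[Y \mid \mathcal{G}]) \le \|X - Y\|_\infty.
\]

By Proposition 6.1(iii),
$\mathbb{E}[X \mid \mathcal{G}] = 1_M \mathbb{E}[X \mid \mathcal{G}] + 1_N \mathbb{E}[X \mid \mathcal{G}]$,
and similarly for Y. The result now follows from Lemma 8.2. □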

9. Martingales 

Definition 9.1. Let $\{\mathcal{F}_n\}_{n=0}^\infty$ be a filtration of
sub-σ-fields of $\mathcal{F}$. A sequence of random variables
$\{X_n\}_{n=0}^\infty$ is a martingale if there exists $X \in L^\infty$ such that
$X_n \in \mathbb{E}[X \mid \mathcal{F}_n]$ for all n (in particular,
$X_n \in L^\infty(\mathcal{F}_n)$).

Remark 9.2. Note that our definition does not imply that
$X_n \in \mathbb{E}[X_{n+1} \mid \mathcal{F}_n]$ for all n. For example, suppose
that $\mathcal{F}_n := \{\emptyset, \Omega\}$ for all n but X is not almost surely
constant, then we obtain a martingale by taking $X_n$ to be any constant in
the ball $\mathbb{E}[X]$, but we only have
$X_n \in \mathbb{E}[X_{n+1} \mid \mathcal{F}_n]$ for all n if
$X_0 = X_1 = X_2 = \ldots$.

Many of the usual real-valued examples of martingales have
$\mathbb{K}$-valued counterparts.

Example 9.3. Let $\{Y_n\}_{n=0}^\infty$ be a sequence of independent random
variables in $L^\infty$ with $0_{\mathbb{K}} \in \mathbb{E}[Y_n]$ for all n. Suppose
that $\sum_{k=0}^\infty Y_k$ converges in $L^\infty$ (by the strong triangle
inequality and the completeness of $L^\infty$, this is equivalent to
$\lim_{n \to \infty} \|Y_n\|_\infty = 0$). Set
$\mathcal{F}_n := \sigma\{Y_0, Y_1, \ldots, Y_n\}$. Put
$X_n := \sum_{k=0}^n Y_k$ and $X := \sum_{k=0}^\infty Y_k$. It follows from the
second claim of Proposition 6.1(i) that
$X_n \in \mathbb{E}[X \mid \mathcal{F}_n]$ for all n and hence
$\{X_n\}_{n=0}^\infty$ is a martingale.

Example 9.4. Let $\{Y_n\}_{n=0}^\infty$ be a sequence of independent random
variables in $L^\infty$ with $1_{\mathbb{K}} \in \mathbb{E}[Y_n]$ for all n. Suppose
that $\prod_{k=0}^\infty Y_k$ converges in $L^\infty$ (by the strong triangle
inequality and the completeness of $L^\infty$, this is equivalent to
$\lim_{n \to \infty} \|Y_n - 1_{\mathbb{K}}\|_\infty = 0$). Set
$\mathcal{F}_n := \sigma\{Y_0, Y_1, \ldots, Y_n\}$. Put
$X_n := \prod_{k=0}^n Y_k$ and $X := \prod_{k=0}^\infty Y_k$. It follows from the
first claim of Proposition 6.1(i) that
$X_n \in \mathbb{E}[X \mid \mathcal{F}_n]$ for all n and hence
$\{X_n\}_{n=0}^\infty$ is a martingale.

Example 9.5. Let $\{Z_n\}_{n=0}^\infty$ be a discrete time Markov chain with
countable state space E and transition matrix P. Set
$\mathcal{F}_n := \sigma\{Z_0, Z_1, \ldots, Z_n\}$. Say that
$f : E \to \mathbb{K}$ is harmonic if f is bounded and for all $i \in E$ the
expectation of f with respect to the probability measure $P(i, \cdot)$
contains f(i) (that is, if f(i) belongs to the smallest ball containing the
set $\{f(j) : P(i, j) > 0\}$). Fix $N \in \{0, 1, 2, \ldots\}$. Then
$\{X_n\}_{n=0}^\infty := \{f(Z_{n \wedge N})\}_{n=0}^\infty$ is a martingale.
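The harmonicity condition of Example 9.5 is easy to test for a finite chain. The sketch below takes $\mathbb{K} = \mathbb{Q}_2$ with rational values of f; the three-state transition structure, the function values, and the name `is_harmonic` are hypothetical and serve only to illustrate the condition that f(i) lies in the smallest ball containing $\{f(j) : P(i, j) > 0\}$.

```python
from fractions import Fraction


def padic_abs(x, p):
    """Exact p-adic absolute value of a rational x, with |0| = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, s = x.numerator, x.denominator, 0
    while num % p == 0:
        num, s = num // p, s + 1
    while den % p == 0:
        den, s = den // p, s - 1
    return Fraction(1, p) ** s


def is_harmonic(f, P, p):
    """For each state i, f(i) must lie in the smallest ball containing f on
    the states reachable from i in one step."""
    for i, row in P.items():
        targets = [Fraction(f[j]) for j, prob in row.items() if prob > 0]
        center = targets[0]
        radius = max(padic_abs(t - center, p) for t in targets)
        if padic_abs(Fraction(f[i]) - center, p) > radius:
            return False
    return True


# State 0 jumps to 1 or 2; states 1 and 2 are absorbing.
P = {0: {1: 0.5, 2: 0.5}, 1: {1: 1.0}, 2: {2: 1.0}}
f = {0: 5, 1: 1, 2: 3}   # |5 - 1|_2 = 1/4 <= |3 - 1|_2 = 1/2, so f is harmonic
g = {0: 2, 1: 1, 2: 3}   # |2 - 1|_2 = 1 > 1/2, so g is not
print(is_harmonic(f, P, 2), is_harmonic(g, P, 2))
```

As with the expectation itself, only the set of reachable states matters, not the actual transition probabilities.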

10. Optional sampling theorem 

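Theorem 10.1. Let $\{\mathcal{F}_n\}_{n=0}^\infty$ be a filtration. Suppose that
$X \in L^\infty$ and $\{X_n\}_{n=0}^\infty$ is a martingale with
$X_n \in \mathbb{E}[X \mid \mathcal{F}_n]$ for all n. If T is a stopping time,
then $X_T \in \mathbb{E}[X \mid \mathcal{F}_T]$.

Proof. It follows from Lemma 4.5 that
$1\{T = n\} \mathbb{E}[X \mid \mathcal{F}_T] = 1\{T = n\} \mathbb{E}[X \mid \mathcal{F}_n]$
and hence, by Proposition 6.1(iii),

\[
\begin{aligned}
\mathbb{E}[X \mid \mathcal{F}_T]
&= \mathbb{E}\Big[ \sum_n X\, 1\{T = n\} \,\Big|\, \mathcal{F}_T \Big] \\
&= \sum_n 1\{T = n\}\, \mathbb{E}[X \mid \mathcal{F}_T] \\
&= \sum_n 1\{T = n\}\, \mathbb{E}[X \mid \mathcal{F}_n] \\
&\ni \sum_n 1\{T = n\}\, X_n \\
&= X_T.
\end{aligned}
\]

□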



11. Martingale convergence 

Theorem 11.1. Let $\{\mathcal{F}_n\}_{n=0}^\infty$ be a filtration. Suppose that
$X \in L^\infty$ and $\{X_n\}_{n=0}^\infty$ is a martingale with
$X_n \in \mathbb{E}[X \mid \mathcal{F}_n]$ for all n. If X is in the closure of
$\bigcup_{n=0}^\infty L^\infty(\mathcal{F}_n)$, then
$\lim_{n \to \infty} \|X_n - X\|_\infty = 0$ (in particular,
$\{X_n\}_{n=0}^\infty$ converges to X almost surely).

Proof. Since X is in the closure of
$\bigcup_{n=0}^\infty L^\infty(\mathcal{F}_n)$, for each $\epsilon > 0$ there exists
$Y \in L^\infty(\mathcal{F}_N)$ for some N such that $\|X - Y\|_\infty < \epsilon$.
Because $\mathcal{F}_N \subseteq \mathcal{F}_n$ for $n \ge N$,
$Y \in L^\infty(\mathcal{F}_n)$ for $n \ge N$.

By Theorem 8.3,
$D_H(\mathbb{E}[X \mid \mathcal{F}_n], \mathbb{E}[Y \mid \mathcal{F}_n]) < \epsilon$
for $n \ge N$. However, $\mathbb{E}[Y \mid \mathcal{F}_n]$ consists of the single
point Y, and so the Hausdorff distance is simply
$\sup\{\|W - Y\|_\infty : W \in \mathbb{E}[X \mid \mathcal{F}_n]\}$. Thus

\[
\|X_n - X\|_\infty \le \|X_n - Y\|_\infty \vee \|Y - X\|_\infty < \epsilon
\]

for $n \ge N$. □